{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from phoenix.otel import register\n",
    "\n",
    "tracer_provider = register()\n",
    "tracer = tracer_provider.get_tracer(__name__)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## LLMs"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Like other span kinds, LLM spans can be instrumented either via a context manager or via a decorator pattern. It's also possible to directly patch client methods.\n",
    "\n",
    "While this guide uses the OpenAI Python client for illustration, in practice, you should use the OpenInference auto-instrumentors for OpenAI whenever possible and resort to manual instrumentation for LLM spans only as a last resort.\n",
    "\n",
    "To run the snippets in this section, set your `OPENAI_API_KEY` environment variable."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Context Manager"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "from openai import OpenAI\n",
    "from opentelemetry.trace import Status, StatusCode\n",
    "\n",
    "openai_client = OpenAI()\n",
    "\n",
    "messages = [{\"role\": \"user\", \"content\": \"Hello, world!\"}]\n",
    "with tracer.start_as_current_span(\"llm_span\", openinference_span_kind=\"llm\") as span:\n",
    "    span.set_input(messages)\n",
    "    try:\n",
    "        response = openai_client.chat.completions.create(\n",
    "            model=\"gpt-4\",\n",
    "            messages=messages,\n",
    "        )\n",
    "    except Exception as error:\n",
    "        span.record_exception(error)\n",
    "        span.set_status(Status(StatusCode.ERROR))\n",
    "    else:\n",
    "        span.set_output(response)\n",
    "        span.set_status(Status(StatusCode.OK))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Decorator"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from typing import List\n",
    "\n",
    "from openai import OpenAI\n",
    "from openai.types.chat import ChatCompletionMessageParam\n",
    "\n",
    "openai_client = OpenAI()\n",
    "\n",
    "\n",
    "@tracer.llm\n",
    "def invoke_llm(\n",
    "    messages: List[ChatCompletionMessageParam],\n",
    ") -> str:\n",
    "    response = openai_client.chat.completions.create(\n",
    "        model=\"gpt-4o\",\n",
    "        messages=messages,\n",
    "    )\n",
    "    message = response.choices[0].message\n",
    "    return message.content or \"\"\n",
    "\n",
    "\n",
    "invoke_llm([{\"role\": \"user\", \"content\": \"Hello, world!\"}])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This decorator pattern above works for sync functions, async coroutine functions, sync generator functions, and async generator functions. Here's an example with an async generator."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from typing import AsyncGenerator, List\n",
    "\n",
    "from openai import AsyncOpenAI\n",
    "from openai.types.chat import ChatCompletionMessageParam\n",
    "\n",
    "openai_async_client = AsyncOpenAI()\n",
    "\n",
    "\n",
    "@tracer.llm\n",
    "async def stream_llm_responses(\n",
    "    messages: List[ChatCompletionMessageParam],\n",
    ") -> AsyncGenerator[str, None]:\n",
    "    stream = await openai_async_client.chat.completions.create(\n",
    "        model=\"gpt-4o\",\n",
    "        messages=messages,\n",
    "        stream=True,\n",
    "    )\n",
    "    async for chunk in stream:\n",
    "        if chunk.choices[0].delta.content:\n",
    "            yield chunk.choices[0].delta.content\n",
    "\n",
    "\n",
    "# invoke inside of an async context\n",
    "async for token in stream_llm_responses([{\"role\": \"user\", \"content\": \"Hello, world!\"}]):\n",
    "    print(token, end=\"\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Method Patch"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "It's also possible to directly patch methods on a client. This is useful if you want to transparently use the client in your application with instrumentation logic localized in one place."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from openai import OpenAI\n",
    "\n",
    "openai_client = OpenAI()\n",
    "\n",
    "# patch the create method\n",
    "wrapper = tracer.llm\n",
    "openai_client.chat.completions.create = wrapper(openai_client.chat.completions.create)\n",
    "\n",
    "# invoke the patched method normally\n",
    "openai_client.chat.completions.create(\n",
    "    model=\"gpt-4o\",\n",
    "    messages=[{\"role\": \"user\", \"content\": \"Hello, world!\"}],\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The snippets above produce LLM spans with input and output values, but don't offer rich UI for messages, tools, invocation parameters, etc. In order to manually instrument LLM spans with these features, users can define their own functions to wrangle the input and output of their LLM calls into OpenInference format. The `openinference-instrumentation` library contains helper functions that produce valid OpenInference attributes for LLM spans:\n",
    "\n",
    "- `get_llm_attributes`\n",
    "- `get_input_attributes`\n",
    "- `get_output_attributes`\n",
    "\n",
    "For OpenAI, these functions might look like this:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "from typing import Any, Dict, List, Optional, Union\n",
    "\n",
    "from openai.types.chat import (\n",
    "    ChatCompletion,\n",
    "    ChatCompletionMessage,\n",
    "    ChatCompletionMessageParam,\n",
    "    ChatCompletionToolParam,\n",
    ")\n",
    "from opentelemetry.util.types import AttributeValue\n",
    "\n",
    "import openinference.instrumentation as oi\n",
    "from openinference.instrumentation import (\n",
    "    get_input_attributes,\n",
    "    get_llm_attributes,\n",
    "    get_output_attributes,\n",
    ")\n",
    "\n",
    "\n",
    "def process_input(\n",
    "    messages: List[ChatCompletionMessageParam],\n",
    "    model: str,\n",
    "    temperature: Optional[float] = None,\n",
    "    tools: Optional[List[ChatCompletionToolParam]] = None,\n",
    "    **kwargs: Any,\n",
    ") -> Dict[str, AttributeValue]:\n",
    "    oi_messages = [convert_openai_message_to_oi_message(message) for message in messages]\n",
    "    oi_tools = [convert_openai_tool_param_to_oi_tool(tool) for tool in tools or []]\n",
    "    return {\n",
    "        **get_input_attributes(\n",
    "            {\n",
    "                \"messages\": messages,\n",
    "                \"model\": model,\n",
    "                \"temperature\": temperature,\n",
    "                \"tools\": tools,\n",
    "                **kwargs,\n",
    "            }\n",
    "        ),\n",
    "        **get_llm_attributes(\n",
    "            provider=\"openai\",\n",
    "            system=\"openai\",\n",
    "            model_name=model,\n",
    "            input_messages=oi_messages,\n",
    "            invocation_parameters={\"temperature\": temperature},\n",
    "            tools=oi_tools,\n",
    "        ),\n",
    "    }\n",
    "\n",
    "\n",
    "def convert_openai_message_to_oi_message(\n",
    "    message_param: Union[ChatCompletionMessageParam, ChatCompletionMessage],\n",
    ") -> oi.Message:\n",
    "    if isinstance(message_param, ChatCompletionMessage):\n",
    "        role: str = message_param.role\n",
    "        oi_message = oi.Message(role=role)\n",
    "        if isinstance(content := message_param.content, str):\n",
    "            oi_message[\"content\"] = content\n",
    "        if message_param.tool_calls is not None:\n",
    "            oi_tool_calls: List[oi.ToolCall] = []\n",
    "            for tool_call in message_param.tool_calls:\n",
    "                function = tool_call.function\n",
    "                oi_tool_calls.append(\n",
    "                    oi.ToolCall(\n",
    "                        id=tool_call.id,\n",
    "                        function=oi.ToolCallFunction(\n",
    "                            name=function.name,\n",
    "                            arguments=function.arguments,\n",
    "                        ),\n",
    "                    )\n",
    "                )\n",
    "            oi_message[\"tool_calls\"] = oi_tool_calls\n",
    "        return oi_message\n",
    "\n",
    "    role = message_param[\"role\"]\n",
    "    assert isinstance(message_param[\"content\"], str)\n",
    "    content = message_param[\"content\"]\n",
    "    return oi.Message(role=role, content=content)\n",
    "\n",
    "\n",
    "def convert_openai_tool_param_to_oi_tool(tool_param: ChatCompletionToolParam) -> oi.Tool:\n",
    "    assert tool_param[\"type\"] == \"function\"\n",
    "    return oi.Tool(json_schema=dict(tool_param))\n",
    "\n",
    "\n",
    "def process_output(response: ChatCompletion) -> Dict[str, AttributeValue]:\n",
    "    message = response.choices[0].message\n",
    "    role = message.role\n",
    "    oi_message = oi.Message(role=role)\n",
    "    if isinstance(message.content, str):\n",
    "        oi_message[\"content\"] = message.content\n",
    "    if isinstance(message.tool_calls, list):\n",
    "        oi_tool_calls: List[oi.ToolCall] = []\n",
    "        for tool_call in message.tool_calls:\n",
    "            tool_call_id = tool_call.id\n",
    "            function_name = tool_call.function.name\n",
    "            function_arguments = tool_call.function.arguments\n",
    "            oi_tool_calls.append(\n",
    "                oi.ToolCall(\n",
    "                    id=tool_call_id,\n",
    "                    function=oi.ToolCallFunction(\n",
    "                        name=function_name,\n",
    "                        arguments=function_arguments,\n",
    "                    ),\n",
    "                )\n",
    "            )\n",
    "        oi_message[\"tool_calls\"] = oi_tool_calls\n",
    "    output_messages = [oi_message]\n",
    "    token_usage = response.usage\n",
    "    oi_token_count: Optional[oi.TokenCount] = None\n",
    "    if token_usage is not None:\n",
    "        prompt_tokens = token_usage.prompt_tokens\n",
    "        completion_tokens = token_usage.completion_tokens\n",
    "        oi_token_count = oi.TokenCount(\n",
    "            prompt=prompt_tokens,\n",
    "            completion=completion_tokens,\n",
    "        )\n",
    "    return {\n",
    "        **get_llm_attributes(\n",
    "            output_messages=output_messages,\n",
    "            token_count=oi_token_count,\n",
    "        ),\n",
    "        **get_output_attributes(response),\n",
    "    }"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Context Manager\n",
    "\n",
    "When using a context manager to create LLM spans, these functions can be used to wrangle inputs and outputs."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import json\n",
    "\n",
    "from openai import OpenAI\n",
    "from openai.types.chat import (\n",
    "    ChatCompletionMessage,\n",
    "    ChatCompletionMessageParam,\n",
    "    ChatCompletionToolMessageParam,\n",
    "    ChatCompletionToolParam,\n",
    "    ChatCompletionUserMessageParam,\n",
    ")\n",
    "from opentelemetry.trace import Status, StatusCode\n",
    "\n",
    "openai_client = OpenAI()\n",
    "\n",
    "\n",
    "@tracer.tool\n",
    "def get_weather(city: str) -> str:\n",
    "    # make an call to a weather API here\n",
    "    return \"sunny\"\n",
    "\n",
    "\n",
    "messages: List[Union[ChatCompletionMessage, ChatCompletionMessageParam]] = [\n",
    "    ChatCompletionUserMessageParam(\n",
    "        role=\"user\",\n",
    "        content=\"What's the weather like in San Francisco?\",\n",
    "    )\n",
    "]\n",
    "temperature = 0.5\n",
    "invocation_parameters = {\"temperature\": temperature}\n",
    "tools: List[ChatCompletionToolParam] = [\n",
    "    {\n",
    "        \"type\": \"function\",\n",
    "        \"function\": {\n",
    "            \"name\": \"get_weather\",\n",
    "            \"description\": \"finds the weather for a given city\",\n",
    "            \"parameters\": {\n",
    "                \"type\": \"object\",\n",
    "                \"properties\": {\n",
    "                    \"city\": {\n",
    "                        \"type\": \"string\",\n",
    "                        \"description\": \"The city to find the weather for, e.g. 'London'\",\n",
    "                    }\n",
    "                },\n",
    "                \"required\": [\"city\"],\n",
    "            },\n",
    "        },\n",
    "    },\n",
    "]\n",
    "\n",
    "with tracer.start_as_current_span(\n",
    "    \"llm_tool_call\",\n",
    "    attributes=process_input(\n",
    "        messages=messages,\n",
    "        invocation_parameters={\"temperature\": temperature},\n",
    "        model=\"gpt-4\",\n",
    "    ),\n",
    "    openinference_span_kind=\"llm\",\n",
    ") as span:\n",
    "    try:\n",
    "        response = openai_client.chat.completions.create(\n",
    "            model=\"gpt-4o\",\n",
    "            messages=messages,\n",
    "            temperature=temperature,\n",
    "            tools=tools,\n",
    "        )\n",
    "    except Exception as error:\n",
    "        span.record_exception(error)\n",
    "        span.set_status(Status(StatusCode.ERROR))\n",
    "    else:\n",
    "        span.set_attributes(process_output(response))\n",
    "        span.set_status(Status(StatusCode.OK))\n",
    "\n",
    "output_message = response.choices[0].message\n",
    "tool_calls = output_message.tool_calls\n",
    "assert tool_calls and len(tool_calls) == 1\n",
    "tool_call = tool_calls[0]\n",
    "city = json.loads(tool_call.function.arguments)[\"city\"]\n",
    "weather = get_weather(city)\n",
    "messages.append(output_message)\n",
    "messages.append(\n",
    "    ChatCompletionToolMessageParam(\n",
    "        content=weather,\n",
    "        role=\"tool\",\n",
    "        tool_call_id=tool_call.id,\n",
    "    )\n",
    ")\n",
    "\n",
    "with tracer.start_as_current_span(\n",
    "    \"tool_call_response\",\n",
    "    attributes=process_input(\n",
    "        messages=messages,\n",
    "        invocation_parameters={\"temperature\": temperature},\n",
    "        model=\"gpt-4\",\n",
    "    ),\n",
    "    openinference_span_kind=\"llm\",\n",
    ") as span:\n",
    "    try:\n",
    "        response = openai_client.chat.completions.create(\n",
    "            model=\"gpt-4o\",\n",
    "            messages=messages,\n",
    "            temperature=temperature,\n",
    "        )\n",
    "    except Exception as error:\n",
    "        span.record_exception(error)\n",
    "        span.set_status(Status(StatusCode.ERROR))\n",
    "    else:\n",
    "        span.set_attributes(process_output(response))\n",
    "        span.set_status(Status(StatusCode.OK))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Decorator\n",
    "\n",
    "When using the `tracer.llm` decorator, these functions are passed via the `process_input` and `process_output` parameters and should satisfy the following:\n",
    "\n",
    "- The input signature of `process_input` should exactly match the input signature of the decorated function.\n",
    "- The input signature of `process_output` has a single argument, the output of the decorated function. This argument accepts the returned value when the decorated function is a sync or async function, or a list of yielded values when the decorated function is a sync or async generator function.\n",
    "- Both `process_input` and `process_output` should output a dictionary mapping attribute names to values.\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from openai import NOT_GIVEN, OpenAI\n",
    "from openai.types.chat import ChatCompletion\n",
    "\n",
    "openai_client = OpenAI()\n",
    "\n",
    "\n",
    "@tracer.llm(\n",
    "    process_input=process_input,\n",
    "    process_output=process_output,\n",
    ")\n",
    "def invoke_llm(\n",
    "    messages: List[ChatCompletionMessageParam],\n",
    "    model: str,\n",
    "    temperature: Optional[float] = None,\n",
    "    tools: Optional[List[ChatCompletionToolParam]] = None,\n",
    ") -> ChatCompletion:\n",
    "    response: ChatCompletion = openai_client.chat.completions.create(\n",
    "        messages=messages,\n",
    "        model=model,\n",
    "        tools=tools or NOT_GIVEN,\n",
    "        temperature=temperature,\n",
    "    )\n",
    "    return response\n",
    "\n",
    "\n",
    "invoke_llm(\n",
    "    messages=[{\"role\": \"user\", \"content\": \"Hello, world!\"}],\n",
    "    temperature=0.5,\n",
    "    model=\"gpt-4\",\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "When decorating a generator function, `process_output` should accept a single argument, a list of the values yielded by the decorated function."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "from typing import Dict, List, Optional\n",
    "\n",
    "from openai.types.chat import ChatCompletionChunk\n",
    "from opentelemetry.util.types import AttributeValue\n",
    "\n",
    "import openinference.instrumentation as oi\n",
    "\n",
    "\n",
    "def process_generator_output(\n",
    "    outputs: List[ChatCompletionChunk],\n",
    ") -> Dict[str, AttributeValue]:\n",
    "    role: Optional[str] = None\n",
    "    content = \"\"\n",
    "    oi_token_count = oi.TokenCount()\n",
    "    for chunk in outputs:\n",
    "        if choices := chunk.choices:\n",
    "            assert len(choices) == 1\n",
    "            delta = choices[0].delta\n",
    "            if isinstance(delta.content, str):\n",
    "                content += delta.content\n",
    "            if isinstance(delta.role, str):\n",
    "                role = delta.role\n",
    "        if (usage := chunk.usage) is not None:\n",
    "            if (prompt_tokens := usage.prompt_tokens) is not None:\n",
    "                oi_token_count[\"prompt\"] = prompt_tokens\n",
    "            if (completion_tokens := usage.completion_tokens) is not None:\n",
    "                oi_token_count[\"completion\"] = completion_tokens\n",
    "    oi_messages = []\n",
    "    if role and content:\n",
    "        oi_messages.append(oi.Message(role=role, content=content))\n",
    "    return {\n",
    "        **get_llm_attributes(\n",
    "            output_messages=oi_messages,\n",
    "            token_count=oi_token_count,\n",
    "        ),\n",
    "        **get_output_attributes(content),\n",
    "    }"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Then the decoration is the same as before."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from typing import AsyncGenerator\n",
    "\n",
    "from openai import AsyncOpenAI\n",
    "from openai.types.chat import ChatCompletionChunk\n",
    "\n",
    "openai_async_client = AsyncOpenAI()\n",
    "\n",
    "\n",
    "@tracer.llm(\n",
    "    process_input=process_input,  # same as before\n",
    "    process_output=process_generator_output,\n",
    ")\n",
    "async def stream_llm_response(\n",
    "    messages: List[ChatCompletionMessageParam],\n",
    "    model: str,\n",
    "    temperature: Optional[float] = None,\n",
    ") -> AsyncGenerator[ChatCompletionChunk, None]:\n",
    "    async for chunk in await openai_async_client.chat.completions.create(\n",
    "        messages=messages,\n",
    "        model=model,\n",
    "        temperature=temperature,\n",
    "        stream=True,\n",
    "    ):\n",
    "        yield chunk\n",
    "\n",
    "\n",
    "async for chunk in stream_llm_response(\n",
    "    messages=[{\"role\": \"user\", \"content\": \"Hello, world!\"}],\n",
    "    temperature=0.5,\n",
    "    model=\"gpt-4\",\n",
    "):\n",
    "    print(chunk)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Method Patch\n",
    "\n",
    "As before, it's possible to directly patch the method on the client. Just ensure that the input signatures of `process_input` and the patched method match."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from openai import OpenAI\n",
    "from openai.types.chat import ChatCompletionMessageParam\n",
    "\n",
    "openai_client = OpenAI()\n",
    "\n",
    "# patch the create method\n",
    "wrapper = tracer.llm(\n",
    "    process_input=process_input,\n",
    "    process_output=process_output,\n",
    ")\n",
    "openai_client.chat.completions.create = wrapper(openai_client.chat.completions.create)\n",
    "\n",
    "# invoke the patched method normally\n",
    "openai_client.chat.completions.create(\n",
    "    model=\"gpt-4o\",\n",
    "    messages=[{\"role\": \"user\", \"content\": \"Hello, world!\"}],\n",
    ")"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": ".venv",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.18"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
